-
Notifications
You must be signed in to change notification settings - Fork 841
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
QBFT: Fix validation of proposal based on older round's prepared block #7875
Conversation
Signed-off-by: Bhanu Pulluri <bhanu.pulluri@kaleido.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it easy to write a test for this? Would be nice to add a regression test if possible. Will add an approval once I've ready through the forCommittedSeal
vs forOnchainBlcok
code a bit more thoroughly.
consensus/qbft/src/main/java/org/hyperledger/besu/consensus/qbft/statemachine/QbftRound.java
Outdated
Show resolved
Hide resolved
Just reading through the code and the detailed explanation you've outlined in the PR, the one thing I wanted to check related to this:
Your fix sounds like it address this by effectively ignoring the round number. I'm wondering if there's an alternative fix which is to make sure the round number in the proposal message is correct for the new round? @jframe I wonder if you've got time to take a look at this PR that Bhanu has raised to see what you think? |
Yes I was thinking about the test , will check if it's relatively easy to add one. |
That's true, I got it incorrect about inclusion of round number . Currently (before this PR) prepared block and proposal both don't include round number in hash calculation . So , as far as validation goes , it should only be checked if prepared and proposed blocks hashes are equal (without any round number replacement) , which happens to be the case based on my experimentation. Prepared round and proposal round info are in metadata . |
Signed-off-by: Bhanu Pulluri <bhanu.pulluri@kaleido.io>
Signed-off-by: Bhanu Pulluri <bhanu.pulluri@kaleido.io>
Made changes to fix the other way as proposed above - adding round numbers to new block proposals and updating the round number proposals based on prepared blocks to current round. updates are made in the block header . |
Tested and verified the fix manually , will continue looking into adding a testcase |
- Check the result of proposer's self proposal validation before sending prepare message - Update roundchange testcases to include round numbers in blocks Signed-off-by: Bhanu Pulluri <bhanu.pulluri@kaleido.io>
Figured that there's already a testcase which could have caught this error - Line 172 in 40b27a1
It throws the validation error in logs 2024-11-14 16:02:16.999-05:00 | Test worker | INFO | RoundChangeManager | Address: 0xb02c0093e94b417620f24d8857d6a1dd3d230272 Round: 4
2024-11-14 16:02:17.013-05:00 | Test worker | INFO | RoundTimer | BFT round 3 expired. Moved to round 4 which will expire in 192 seconds
2024-11-14 16:02:17.029-05:00 | Test worker | INFO | ProposalValidator | Invalid Proposal Payload: Latest Prepared Metadata blockhash does not align with proposed block
Task :consensus:qbft:integrationTest but doesn’t fail due to another bug in the code - while proposer handles their own proposal, it doesn’t check if the proposal is valid before sending prepare. Just fixing that made the above test fail. Then included the change to add round number in block header extradata, corresponding test case changes and verified all the testcases in consensus/qbft/src/integration-test/java/org/hyperledger/besu/consensus/qbft/test pass. |
This feels like a better fix to me @pullurib, thanks for tweaking it and for the extra tests. |
Since this fixes a number of issues, think it's worth adding a changelog entry @pullurib |
Signed-off-by: Bhanu Pulluri <bhanu.pulluri@kaleido.io>
Done, thanks |
PR description
From the logs, this particular occurence of the bug happened when transaction selection reached the time limit and then the round ran out of time before commit.
There was a block proposed and prepared but not committed so it was propagated to new round .
The new proposer identifies the Round Change (RC) message with the highest prepared round and bases a new proposal on the attached prepared message for that highest round. The proposal’s block header reflects the current round number, but the prepared message still references the prior round. In the code, an attempt is made to match the proposal’s block header round with the highest prepared round for validation by comparing the hashes of the proposed block and the underlying prepared block. However, the hashing method in BlockHeaderFunctions used is incorrect, as it replaces the round with 0.
As a result, validators continue sending RC messages with their already prepared blocks, causing this error to repeat and stall the network, preventing consensus progress. This issue occurs only when the highest prepared round is greater than 0—meaning consensus has moved to round 2 or beyond, using a prepared message from round 1 or later.
So made change to use the right hashing method which doesn’t touch the round number.
I was able to repro the error by forcing one of the nodes to not send commit message in round 0 and round 1 .
Fixed Issue(s)
#6732 and may be #6613 and may be #6053